Abstract

Purpose Employing representative data is necessary for producing a credible LCA that informs the decision-making process. When data are available from multiple sources and in incompatible formats such as point estimates, intervals, and approximations, and may even be conflicting in nature, it is important to synthesize them with minimal loss of information to enhance the credibility of the LCA. This article introduces a framework for information fusion that can serve this purpose within the current operational procedure of LCA.

Methods The character of information gathered from multiple sources is inherently different from that exhibited by information generated by a single random source. The framework of possibility theory can be used to merge such heterogeneous information, as demonstrated by its application in diverse fields such as engineering, finance, and the social sciences. This article introduces this methodology for LCAs by first introducing the theory behind data modeling and data fusion with possibility theory. Then, this framework is applied to the disparate data from the literature on the manufacturing energy requirements for semiconductor device fabrication, and also to a hypothetical example of linguistic inputs from experts, in order to demonstrate the operationalization of the theory. A flowchart is provided to recap the framework and for easy navigation through the steps of the merging procedure.

Results and discussion The framework for fusion of information, applied to numerical and linguistic heterogeneous data in the LCA context, illustrates that this methodology can be implemented relatively easily to increase the data quality and credibility of LCA. This can be done without making any changes to the usual preferred way of conducting an LCA. Information fusion may be performed either after the sensitivity analysis identifies the most impactful categories that need further investigation, or upfront for select input categories of interest.

Conclusions The article introduces a well-established framework of information fusion to the field of LCA, where disparate data may need to be fused to perform the assessment under certain conditions. This framework can be easily implemented, and will enhance data quality and LCA credibility. We also hope that data entry software such as ecoEditor will make provision for the data entry mechanism necessary to enter fused data.

Keywords Data fusion • Data quality • Data representativeness • Epistemic uncertainty • Life cycle assessment • Possibility theory

1 Introduction 

Representativeness of data is crucial in correctly characterizing the environmental performance of a product system in Life Cycle Assessment (LCA). According to ISO 14044, representativeness is defined as “qualitative assessment of the degree to which the data set reflects the true population of interest (i.e. geographical coverage, time period and technology coverage)” (ISO 2006). In practice, few data used in LCA are provided with empirically measured statistical distribution information, and LCA practitioners often need to work with anecdotal data points or conflicting information from disparate sources. LCA practitioners use, among others, the pedigree framework introduced by Weidema and Wesnaes (1996) for characterizing data quality, where representativeness is an important element. The framework has been widely adopted in Life Cycle Inventory (LCI) databases (Ecoinvent 2013; Frischknecht et al. 2004), and it is now an integral part of data exchange formats such as EcoSpold version 2 (Ecoinvent 2013), where data fields from 2,100 to 2,173 correspond to relevant criteria for assessing representativeness.

Maintaining representativeness of the data used is a universal challenge in an LCA, and it is even more severe in LCAs assessing new technologies, where information may be either immature or well guarded for business reasons (Gavankar et al. 2013; Heijungs and Huijbregts 2004). Sparse and imprecise data also introduce non-random epistemic uncertainty, which is driven by the lack of information rather than by variability in the data (Reap et al. 2008; Couso and Dubois 2009; Dubois and Prade 1980; Dubois 2011; Chevalier and Teno 1996; Benetto et al. 2005; Ardente et al. 2004; Andrae et al. 2004; Heijungs and Huijbregts 2004; Huijbregts et al. 2001). This happens because, when the product is new, the information is often gathered from multiple sources, such as patents, literature, and laboratory experiments, in order to have a reasonable estimate for an input (Khanna and Bakshi 2009; Healy et al. 2008; Meyer et al. 2010). Moreover, this information may be in incompatible formats such as point estimates, intervals, approximations or linguistic expressions, and may also be conflicting in nature. It is important then to synthesize these data with minimal loss of information in order to get a good estimate of the variable. Formalized information synthesis rooted in robust theory can help reduce epistemic uncertainty and enhance data quality, thereby increasing the credibility of LCAs.

The acknowledgement of epistemic uncertainty due to imprecise data, the need to separate it from random variability, and the inability of the probabilistic framework to process such uncertainty led to the application of fuzzy numbers in the LCA literature as early as 1996, when Chevalier and Teno used intervals and min-max numbers to process the data inputs through LCA (Chevalier and Teno 1996). This was followed by the studies by Weckenmann and Schwan (2001) and Gonzalez et al. (2002), where the options of performing quick and inexpensive LCAs were explored with the help of fuzzy intervals. Ardente et al. (2004) then introduced the fuzzy inference technique in the LCA literature, and von Bahr and Steen (2004) proposed reducing epistemological uncertainty by using fuzzy LCI. The article by Tan (2008) then formalized the processing of fuzzy LCI through the matrix-based LCI model introduced by Heijungs and Suh (2002).

The fuzzy-number-based work in the LCA literature mentioned above can be tied to the theoretical framework of possibility theory, and it is presented as such in the more recent LCA literature addressing epistemic uncertainty (Andrae et al. 2004; Benetto et al. 2005; Clavreul et al. 2013). Andrae et al. (2004) presented an overview of possibility theory in the context of uncertainty propagation in LCA. This discussion is carried forward by Clavreul et al. (2013) by illustrating the fundamental difference between the possibilistic and probabilistic representations of uncertainty and the importance of applying an appropriate propagation methodology.

Possibility theory can also be used to model and fuse subjective and conflicting information from multiple sources, and in incompatible formats, in order to obtain a more plausible, representative data point. This approach, though highly useful in enhancing data quality, has not yet been introduced in the LCA context. The Shonan Guidance Principle, which addresses the challenges in collection and management of imprecise raw data, advises ranking the data points by quality against agreed-upon criteria and then picking the best representative data point (e.g., Section 2.2.3) (UNEP 2011). However, the document does not provide guidance for cases where ranking multiple data points to get the best one is not possible.

The objective of this paper is to introduce and apply possibility theory to fuse multiple raw data that may be in conflict or given only in linguistic descriptions (Benoit and Mazijn 2009). We proceed by first presenting the relevant part of possibility theory behind information fusion in the next section. Then, this framework is applied to the disparate numerical data from the literature on the manufacturing energy requirement for semiconductor device fabrication and to a hypothetical example of linguistic data on engineered nanomaterial (ENM) fate and transport behavior under certain conditions. We also recap this procedure in a flowchart at the end of the section. The article concludes with a discussion on where such an exercise will be most useful in enhancing the quality of the LCA results without drastically changing the current preferred practices of conducting the assessment, and the role the LCI databases can play in this endeavor.

2 Fusion of data provided by multiple sources 

Much theoretical work on information synthesis is housed in the various branches of mathematics, with numerous applications already evident in the fields of economics and finance (Choobineh and Behrens 1992), engineering (Mourelatos and Zhou 2005), and risk management (Baskerville and Portougal 2003). When information comes from multiple sources, the heterogeneity and subjectivity behind the assumptions and presentation of data make the character of the information gathered inherently different from that exhibited by information coming from a single random source, if it were to produce this information (Dubois and Prade 1994). The latter represents a situation where the data can be processed statistically and probabilistically. But merging non-random and subjective information is an exercise in filtering less reliable information and seeking consistency out of imprecise information (Dubois and Prade 1994; Bloch et al. 2001). In practice, processing this type of information has been carried out in LCA by expert judgment. However, this type of synthesis can be performed with possibility theory with the help of appropriate fusion operations (Wolkenhauer 1998; Yager 1983; Tanaka and Guo 1999; Dubois and Prade 1994, 2001, 2004). To do so, it is necessary to first model or represent the data appropriately. Accordingly, data modeling is presented ahead of formalized information fusion in the following text.

2.1 Data modeling with fuzzy set theory 

Fuzzy sets are a generalization of the classical sets to account for the non-crisp information coming from linguistic and vague information (Zadeh 1978). In classical set theory, it is clearly determined whether an item belongs to the set (e.g., “one” or “true”) or not (e.g., “zero” or “false”). However, in fuzzy set theory, membership degrees or gradations exist between zero and one, and the members can have different degrees of membership, given by the “membership function” $\mu(x)$ in the interval [0, 1]. Abundant literature and textbooks are available explaining the basics of fuzzy sets (Chaturvedi 2008; Dubois and Prade 1980; Lee 2005).

As illustrated in Fig. 1, $\mu(x)$ is bounded by 0 and 1, i.e., $0 \le \mu(x) \le 1$. Also, $\mu(x)$ is normal, i.e., at least one element exists for which $\mu(x) = 1$. The membership function $\mu(x)$ of a fuzzy variable can also be described in terms of $\alpha$-cuts at different vertical levels $\alpha$ (Fig. 1). The $\alpha$-cuts are slices through a fuzzy set that produce regular classical (non-fuzzy) sets. In that sense, every fuzzy set is a collection of non-fuzzy or crisp sets at the levels of $\alpha$.

Zadeh (1978) proposed an interpretation of membership functions of fuzzy sets as possibility distributions. It is important to note here that fuzzy and possibility theories appear similar under certain situations, but they are not the same. Fuzzy set theory models a proposition’s membership in the known true value set, whereas possibility theory determines how likely the value is to be the true value (Dubois et al. 2000; Dubois 2001). The conventions of fuzzy numbers may, however, be used to represent the data under the discussion of possibility theory, as can be seen from the discussion below.

Fig. 1 The $\alpha$-level presentation in a fuzzy set or number

This approach has already been used in the LCA setup (Ardente et al. 2004; Chevalier and Teno 1996; Tan 2008). Under this interpretation, the possibility of a value being the true value is estimated by using Eq. (1). This setup is mathematically the same as the basic calculation of the membership $\mu(x)$ of $x$ when $x$ lies within the interval $(a, c)$ with $b$ as the most likely value. This interpretation will be recalled later in Section 3 for the discussion on the fusion of information presented in intervals:

$$
\alpha =
\begin{cases}
\dfrac{x-a}{b-a}, & a < x < b \\
1, & x = b \\
\dfrac{c-x}{c-b}, & b < x < c \\
0, & \text{otherwise}
\end{cases}
\qquad (1)
$$
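As an illustration of Eq. (1), the triangular membership can be sketched in a few lines of Python. The function name `triangular_membership` and the test values are assumptions made for this sketch and do not come from the cited literature.

```python
def triangular_membership(x: float, a: float, b: float, c: float) -> float:
    """Possibility (membership) of x for a triangular fuzzy number.

    (a, c) is the support and b the most likely value, following Eq. (1).
    The function name is a hypothetical choice for this sketch.
    """
    if not a < b < c:
        raise ValueError("expected a < b < c")
    if a < x < b:
        return (x - a) / (b - a)   # rising edge
    if x == b:
        return 1.0                 # most likely value
    if b < x < c:
        return (c - x) / (c - b)   # falling edge
    return 0.0                     # outside the support


# Illustrative values only: support (0, 4) with peak at 2.
print(triangular_membership(1.0, 0.0, 2.0, 4.0))
print(triangular_membership(3.0, 0.0, 2.0, 4.0))
```

Both calls evaluate points halfway up an edge of the triangle, so each should return a possibility of one half.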



2.2 Data modeling with possibility theory 
2.2.1 The basics of possibility theory 

Conceptually, possibility is an upper bound of probability, i.e., what is probable must always be possible (Zadeh 1978). The formalization of possibility theory presented in this section is richly available in the literature (Tanaka and Guo 1999; Yager 1983; Wolkenhauer 1998; Dubois and Prade 1988): If $\Omega$ is a frame of discernment (i.e., possibility space) consisting of a set of possible values of $\omega$, then the possibility distribution function $\pi(\omega)$, where $0 \le \pi(\omega) \le 1$, represents the possibility of an element $\omega$ from the set $\Omega$. The condition of normalcy, i.e., the condition that at least one value from the set $\Omega$ is the true value, is assumed in the framework of possibility theory, and is formally expressed as $\sup\{\pi(\omega) : \omega \in \Omega\} = 1$. Also, a possibility measure $\Pi$ is different from a possibility distribution $\pi$. The former is the supremum, i.e., the least upper bound, of the latter over a set $A \subseteq \Omega$, and is defined as:

$$
\Pi(A) = \sup_{\omega \in A} \pi(\omega) = \max(\{\pi(\omega) \mid \omega \in A\}). \qquad (2)
$$

This measure of possibility $\Pi$ has a dual measure called “necessity”, $N$, which is always less than or equal to $\Pi$ and represents the absolute certainty of a proposition. It is defined as $N(A) = 1 - \Pi(A^c)$, i.e., the proposition is necessary or certain when the opposite is impossible. Here $A^c$ represents the complement of $A$.

An important aspect of possibility theory is that it is non-additive, in order to account for subjective information which may not always add up. This feature separates it from the subjective branch of probability, such as Bayesian probability, which enforces the additive nature of probability (Dubois et al. 2000). The non-additivity indicates that the union of two independent propositions is not equivalent to their addition. Rather, the possibilistic union follows the theorem of maxitivity: $\Pi(A \cup B) = \max(\Pi(A), \Pi(B))$. This means that when $A$ and $B$ are considered together, whichever is more easily possible will determine whether $A \cup B$ happens or not. Necessity measures satisfy an axiom dual to that of possibility measures: $N(A \cap B) = \min(N(A), N(B))$.
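The possibility and necessity measures described above can be sketched for a finite frame of discernment. The helper names and the three-element distribution below are hypothetical and serve only to illustrate maxitivity and its dual.

```python
def possibility(pi: dict, event: set) -> float:
    """Pi(A): largest pi(w) over the elements of the event; 0 for the empty event."""
    return max((pi[w] for w in event), default=0.0)


def necessity(pi: dict, event: set) -> float:
    """N(A) = 1 - Pi(complement of A) within the frame given by pi's keys."""
    complement = set(pi) - set(event)
    return 1.0 - possibility(pi, complement)


# Hypothetical normalized distribution (at least one element has possibility 1).
pi = {"w1": 1.0, "w2": 0.7, "w3": 0.2}
A = {"w1", "w2"}
B = {"w1", "w3"}

# Maxitivity of possibility and the dual axiom for necessity.
print(possibility(pi, A | B) == max(possibility(pi, A), possibility(pi, B)))
print(necessity(pi, A & B) == min(necessity(pi, A), necessity(pi, B)))
```

The sketch assumes a discrete set of candidate values; for continuous variables the maximum would be replaced by a supremum over the interval of interest.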

2.2.2 Data modeling with possibility theory 

When the information is available in the form of an interval, it can be refined, for example with the help of expert opinion, in that multiple nested intervals can be obtained from the experts with various levels of possibility. These nested possibilistic intervals may thus look like the fuzzy set in Fig. 1, with the various possibility levels playing a role much akin to the $\alpha$-cuts. The nested nature of information can be more formally captured as follows (Dubois and Prade 1994): Let the possibility distribution $\pi(\omega)$ represent a family of nested confidence subsets $\{A_1, A_2, \ldots, A_m\}$ where $A_i \subset A_{i+1}$, $i = 1, \ldots, m-1$, assuming that the set of possibility values is finite. Now, let $\lambda_i$ be the level of confidence or belief that can be interpreted as the lowermost bound of the probability that the true value is in $A_i$. This makes $\lambda_i$ a degree of necessity $N(A_i)$ of $A_i$. By definition, $N(A_i) = 1 - \Pi(A_i^c)$ as indicated above, where $\Pi(A_i)$ is the maximum degree of possibility that the true value is in $A_i$. Hence, the possibility distribution applicable to the family $\{(A_1, \lambda_1), (A_2, \lambda_2), \ldots, (A_m, \lambda_m)\}$ is defined as the least specific possibility distribution that obeys the constraints $\lambda_i = N(A_i)$, and reflects the minimum specificity principle as (Dubois and Prade 1994):

$$
\pi(\omega) =
\begin{cases}
1, & \text{if } \omega \in A_1 \\
\min\{1 - \lambda_i : \omega \notin A_i\}, & \text{otherwise}
\end{cases}
\qquad (3)
$$


The principle of minimum specificity states that any hypothesis not known to be impossible cannot be ruled out (Dubois and Prade 1988; Yager 1983). This logic makes the most specific opinion the most restrictive and informative, and also makes a possibility distribution at least as specific as another one if and only if each state is at least as possible according to the latter as to the former (Yager 1983). Hence, for nested intervals, $\pi(\omega) = 1$ if $\omega$ belongs to $A_1$, with $A_1$ being the common subset of all the nested intervals by definition. Otherwise, $\pi(\omega)$ is the minimum of all the $(1 - \lambda_i)$ corresponding to the intervals that do not contain $\omega$.
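The rule for nested intervals can be sketched as follows. The function name, the tuple layout `(low, high, confidence)`, and the three example intervals are assumptions chosen for illustration; they are not data from the literature.

```python
def nested_possibility(w: float, intervals: list) -> float:
    """Least specific possibility of w given nested intervals with confidences.

    Each entry is (low, high, lam), where lam is the confidence (necessity)
    that the true value lies in [low, high]. Returns 1 when every interval
    contains w, otherwise the smallest 1 - lam among the intervals that
    exclude w.
    """
    excluded = [1.0 - lam for low, high, lam in intervals if not low <= w <= high]
    return min(excluded, default=1.0)


# Hypothetical nested opinions: wider intervals carry higher confidence.
opinions = [(1.4, 1.6, 0.5), (1.2, 1.8, 0.75), (0.8, 2.5, 1.0)]

for w in (1.5, 1.3, 2.0, 3.0):
    print(w, nested_possibility(w, opinions))
```

A value inside the innermost interval keeps full possibility, and the possibility steps down each time the value leaves one more of the nested intervals.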

This logic of minimum specificity does not change when the information is presented as point values or sets of point values instead of intervals. Accordingly, the least specific possibility distribution compatible with the pairs of the possible values ($A_i$) and respective beliefs ($\lambda_i$) placed on them is presented as (Benferhat et al. 1997):

$$
\pi(\omega) =
\begin{cases}
\min\{1 - \lambda_i \mid (A_i, \lambda_i),\ \omega \notin A_i\} = 1 - \max\{\lambda_i \mid (A_i, \lambda_i),\ \omega \notin A_i\}, & \text{if } \omega \notin A_i \text{ for some } i \\
1, & \text{otherwise}
\end{cases}
\qquad (4)
$$


2.3 Data fusion with possibility theory 

The fusion or merging of possibility distributions (i.e., the distributions that look like Eqs. (3) and (4) above) is based on set theoretic or logical methods of the combination of uncertain information. The main properties of these combination rules, as applicable to possibility theory, described in the literature are closure, commutativity, and adaptiveness (Dubois and Prade 1988). Closure refers to the property whereby the result of the combination belongs to the same representation framework as the individual pieces of information. In the context of this article, this property implies that if the individual pieces of information are presented as possibility distributions, then their fusion will also be a possibility distribution. Compliance with commutativity indicates that changing the order of the operands (pieces of information to be fused, in this case) does not change the results. Lastly, adaptiveness indicates that the amount of overlap between the pieces of information affects the choice of fusion operator.

There are two basic modes of data fusion in possibility theory: the conjunctive mode, when all the sources agree and are considered reliable; and the disjunctive mode, when sources disagree and at least one of them is wrong, but it is not known which one (Liu 2007; Dubois and Prade 2001). The choice between conjunction and disjunction is influenced by the degree of closeness in the information and the characteristics of their sources, such as datedness and reliability (Liu 2007; Dubois and Prade 2001). The next step is to choose the procedure for performing conjunction or disjunction. The context of operation decides which procedure to follow. The main choices for conjunction of $a$ and $b$ are “min” ($= \min(a, b)$), “product” ($= a \cdot b$) and “Łukasiewicz t-norm” ($= \max(0, a + b - 1)$). Their corresponding dual operations can be used for disjunctive fusion. They are “max” ($= \max(a, b)$), “probabilistic sum” ($= a + b - a \cdot b$) and “Łukasiewicz t-conorm” ($= \min(1, a + b)$).
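The three conjunctive operators and their disjunctive duals can be written out directly. The sketch below uses hypothetical function names and illustrative inputs, and checks the duality relation S(a, b) = 1 - T(1 - a, 1 - b) that links each pair.

```python
def t_min(a: float, b: float) -> float:
    return min(a, b)

def t_product(a: float, b: float) -> float:
    return a * b

def t_lukasiewicz(a: float, b: float) -> float:
    return max(0.0, a + b - 1.0)

def s_max(a: float, b: float) -> float:
    return max(a, b)

def s_probabilistic_sum(a: float, b: float) -> float:
    return a + b - a * b

def s_lukasiewicz(a: float, b: float) -> float:
    return min(1.0, a + b)


# Each conjunction is paired with its dual disjunction.
PAIRS = [
    (t_min, s_max),
    (t_product, s_probabilistic_sum),
    (t_lukasiewicz, s_lukasiewicz),
]

a, b = 0.25, 0.5  # illustrative possibility degrees
for t_norm, t_conorm in PAIRS:
    dual = 1.0 - t_norm(1.0 - a, 1.0 - b)
    print(t_norm.__name__, t_norm(a, b), t_conorm.__name__, t_conorm(a, b), dual)
```

Only “min” is idempotent, i.e., `t_min(a, a)` returns `a`, whereas the product of `a` with itself is smaller than `a` for any degree strictly between 0 and 1.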

As explained by Dubois and Prade (1994, 2001, 2004), conjunction can be operationalized as “product” if and only if the sources of information are independent of each other, and there is no common background or communication between them whatsoever. Otherwise, unjustified reinforcement of possibility may happen upon multiplication. Similarly, to use the “Łukasiewicz t-norm” one needs to make a drastic assumption that some of the sources are lying; in the case of a two-source operation, one of them is lying on purpose. On the other hand, the “min” operation corresponds to a purely logical view of the combination process and assumes that the source which assigns the least possibility degree to a given value is the best-informed with respect to this value. No reinforcement effect comes into play due to the idempotence of “min” operations. Considering these conditions, this article will follow the logic behind the “min” (and hence “max”)
operation.

Hence, for the purpose of information fusion during the data gathering stage of LCA, the possibility distributions ($\pi_1, \pi_2, \pi_3, \ldots, \pi_n$) on $\Omega$ can be merged either conjunctively ($\pi_{cm}$) or disjunctively ($\pi_{dm}$) as:

$$
\forall \omega \in \Omega: \quad \pi_{cm}(\omega) = \min_i \pi_i(\omega), \qquad \pi_{dm}(\omega) = \max_i \pi_i(\omega). \qquad (5)
$$

Upon conjunctive combination of intervals, the result may become subnormalized, i.e., it may happen that the real value lies outside the conjunction. This is more likely to happen when the overlap between the intervals is not significant and the sources are in conflict. Normalization of the resulting conjunct interval can be achieved by dividing it by the consistency measure of information, i.e., dividing by the measure of overlap (Dubois and Prade 1994). When the information is available as intervals, it can be presented in triangular form as in Fig. 1, and the measure of consistency is the height of the intersection ($h$) of the overlap between the two triangles. Thus:

$$
\forall \omega \in \Omega: \quad \pi_m(\omega) = \frac{\min(\pi_1(\omega), \pi_2(\omega))}{h(\pi_1, \pi_2)} \qquad (6)
$$

In Eq. (6), $h(\pi_1, \pi_2)$ is a natural measure of overlap between the two possibility distributions. It is called the consistency index. Graphically, it is the height of the intersection of $\pi_1$ and $\pi_2$, as can be seen in Fig. 3:

$$
h(\pi_1, \pi_2) = \sup_{\omega \in \Omega} \pi_{cm}(\omega) \qquad (7)
$$

Thus normalization focuses on the agreed upon values. But it is also important to keep track of the conflict. This is done by discounting the conjunctive fusion by the weight of inconsistency, which is defined as $1 - h(\pi_1, \pi_2)$. This represents the degree of possibility that both sources are wrong, and mirrors the argument that both sources are supposed to be right under normalization, i.e., when $h(\pi_1, \pi_2) = 1$. However, the available pieces of information may not be clear candidates for conjunctive or disjunctive fusion. This more general case is formalized by Dubois and Prade (1994) in an adaptive fusion rule (Eq. (8)), where neither conjunction nor disjunction presents a suitable option for information fusion and both need to be operationalized together in order to retain the consistent information while not losing sight of the conflict. Equation (8) presents the adaptive fusion of two possibility distributions ($\pi_1$, $\pi_2$):

$$
\forall \omega \in \Omega: \quad \pi_{fused}(\omega) = \max\left( \frac{\min(\pi_1(\omega), \pi_2(\omega))}{h(\pi_1, \pi_2)},\ \min\big(\max(\pi_1(\omega), \pi_2(\omega)),\ 1 - h(\pi_1, \pi_2)\big) \right) \qquad (8)
$$

Dubois and Prade (1994) extend this formalization to the fusion of $\pi_i$ with $i = 1, \ldots, n$. It is represented in Eq. (9), with $\pi_{cm}$ and $\pi_{dm}$ following the definition in Eq. (5) and $h_{cm}$ following Eq. (7):

$$
\forall \omega \in \Omega: \quad \pi_{fused}(\omega) = \max\left( \frac{\pi_{cm}(\omega)}{h_{cm}},\ \min\big(\pi_{dm}(\omega),\ 1 - h_{cm}\big) \right) \qquad (9)
$$
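The adaptive fusion of two possibility distributions can be sketched on a finite grid of candidate values. The function name, the dictionary representation, and the three-point example are assumptions for illustration. The handling of total conflict (h = 0), where the normalized conjunction is undefined, is also an assumption of this sketch: it falls back to the disjunction.

```python
def adaptive_fusion(pi1: dict, pi2: dict) -> dict:
    """Adaptive fusion of two possibility distributions on the same grid.

    Combines the normalized conjunction (min divided by the consistency
    index h) with the disjunction (max) capped at 1 - h.
    """
    if pi1.keys() != pi2.keys():
        raise ValueError("both distributions must share the same grid")
    conj = {w: min(pi1[w], pi2[w]) for w in pi1}
    disj = {w: max(pi1[w], pi2[w]) for w in pi1}
    h = max(conj.values())  # consistency index: height of the intersection
    if h == 0.0:
        # Assumption of this sketch: with no overlap at all, only the
        # disjunctive information remains.
        return disj
    return {w: max(conj[w] / h, min(disj[w], 1.0 - h)) for w in pi1}


# Hypothetical, partly conflicting sources on a three-point grid.
source_1 = {"w1": 0.5, "w2": 0.25, "w3": 0.0}
source_2 = {"w1": 0.25, "w2": 0.5, "w3": 1.0}
print(adaptive_fusion(source_1, source_2))
```

In this example the values on which both sources agree to some degree are raised to full possibility, while the value supported by only one source is capped at the weight of inconsistency, 1 - h.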



3 Information fusion in LCAs 

In this section, we will illustrate how the theoretical framework presented in the earlier section can be applied to an LCA study. Consider an LCA of an electronic product with one or more semiconductor devices. The semiconductor industry is one of the largest end users of energy (Gopalakrishnan et al. 2010) and, going by the EPA requirement of semiconductor manufacturers’ mandatory greenhouse gas (GHG) reporting (EPA 2013), its GHG emissions are of concern.

Against this background, we will illustrate how using information fusion at the data gathering stage can refine the global warming indicator result of semiconductor CMOS (complementary metal-oxide-semiconductor) wafer fabrication. By industry convention, the wafer is assumed to be ready with the printed CMOS devices. Hence, this process is also alternatively referred to as CMOS device fabrication. A running example of manufacturing energy requirement is used to show the operations with numerical values. This example, however, does not allow the demonstration of fusing linguistic opinion-set information. For this reason, we will resort to a very likely situation faced by a practitioner performing a nano-electronics LCA: estimating a nanomaterial’s fate and transport. Here, hypothetical opinion sets on the fate and transport of a manufactured nanomaterial will be fused to generate a robust possibility distribution of the material’s fate and transport pathways. The four subsections below represent the four most common situations faced at the data gathering stage.


3.1 Case 1: two data points available

We start with the two extreme CMOS device manufacturing energy requirement data points found in the literature: 0.8 E+04 kWh/m² (ITRS 2001-5) and 2.5 E+04 kWh/m² (Yao et al. 2004). Without further information, the mid-point, i.e., 1.65 E+04 kWh/m² (1.65 kWh/cm²), can be considered as the most likely value (Tan 2008). Based on Eq. (1), this information can be modeled as the fuzzy triangular set with the $\alpha$-cuts representing the possibility distribution as (with $x$ in E+04 kWh/m²):

$$
\alpha =
\begin{cases}
\dfrac{x - 0.8}{1.65 - 0.8}, & 0.8 < x < 1.65 \\
1, & x = 1.65 \\
\dfrac{2.5 - x}{2.5 - 1.65}, & 1.65 < x < 2.5 \\
0, & \text{otherwise}
\end{cases}
$$

The possibility distribution for a given number within the range 0.8 to 2.5 E+04 kWh/m² can be calculated from this setup. These numbers and their corresponding global warming (GW) impact using TRACI 2 characterization factors are provided in Table 1. As can be seen from Fig. 2, the possibility distribution corresponding to the data in Table 1 can thus be visualized as a set of nested intervals.
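The energy bounds at each possibility level follow from inverting the two edges of the triangular distribution. The sketch below, with the hypothetical helper `alpha_cut`, computes these bounds for the two-data-point example (values in E+04 kWh/m²); the GW conversion is omitted because the characterization factor is not stated explicitly in the text.

```python
def alpha_cut(alpha: float, a: float, b: float, c: float) -> tuple:
    """Interval [low, high] of a triangular distribution (a, b, c) at level alpha.

    Obtained by solving alpha = (x - a)/(b - a) and alpha = (c - x)/(c - b) for x.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    return a + alpha * (b - a), c - alpha * (c - b)


# Extreme data points 0.8 and 2.5 with the mid-point 1.65 as most likely value.
for alpha in (0.25, 0.5, 0.75):
    low, high = alpha_cut(alpha, 0.8, 1.65, 2.5)
    print(f"alpha={alpha}: {low:.4f} to {high:.4f} E+04 kWh/m2")
```

The intervals shrink toward the mid-point as alpha increases, which is the nested structure described above.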

Table 1 Possibility distribution of GW due to CMOS wafer fabrication based on two data points

| α-cut or possibility | Energy in kWh/m², 0.8 < x < 1.65 (E+04) | GW in kg CO₂ eq/m² | Energy in kWh/m², 1.65 < x < 2.5 (E+04) | GW in kg CO₂ eq/m² |
|---|---|---|---|---|
| 0.25 | 1.01E+04 | 7.67E+03 | 2.29E+04 | 1.73E+04 |
| 0.5 | 1.22E+04 | 9.28E+03 | 2.08E+04 | 1.57E+04 |
| 0.75 | 1.44E+04 | 1.09E+04 | 1.86E+04 | 1.41E+04 |

Fig. 2 Graphical presentation of the possibility distribution of the data in Table 1. The stacked columns represent possibility intervals on the Y-axis for their corresponding GW intervals on the X-axis (GW in kg CO₂ eq per m² of CMOS wafer)

3.2 Case 2: information available as intervals that are not nested

When the information is available from disparate sources and in multiple intervals, it may not be so nicely nested. For CMOS device fabrication, the literature also provides three intervals: 0.8 to 1.4 E+04 kWh/m² (ITRS 2001-5), 1.2 to 1.6 E+04 kWh/m² adjusted value based on a 150-mm wafer (Schischke et al. 2001), and 1.4 to 1.6 E+04 kWh/m² adjusted value based on a 200-mm wafer (Ciceri et al. 2010). In this group, since the third interval can be viewed as a subset of the second one, it will suffice to fuse information from the first two intervals.

In order to use Eq. (8) for this task, we need to find the value of $h$, which is the height of the common intersection of the two triangular distributions represented by the two intervals. Setting the common point $\alpha = h$ in Eq. (1) for both intervals, we get $h = 0.4$ and $(1 - h) = 0.6$. Graphically, this is the height of the intersection of the two triangular distributions as illustrated in Fig. 3, which also depicts the resulting fused possibility distribution.

Representative calculations behind Fig. 3 can be seen in Table 2. The second-to-last column of Table 2 corresponds to the fused distribution, with the last column providing the respective GW information.
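The consistency index for the first two intervals of Case 2 can be checked numerically. The sketch below assumes, as in Case 1, that each interval is modeled as a triangle with its mid-point as the most likely value; the function names are hypothetical.

```python
def tri(x: float, a: float, b: float, c: float) -> float:
    """Triangular possibility of x for support (a, c) and peak b, as in Eq. (1)."""
    if a < x < b:
        return (x - a) / (b - a)
    if x == b:
        return 1.0
    if b < x < c:
        return (c - x) / (c - b)
    return 0.0


def consistency_index(t1: tuple, t2: tuple) -> float:
    """Height at which the falling edge of the lower-peaked triangle meets
    the rising edge of the higher-peaked one; 0 if the supports do not overlap."""
    (a1, b1, c1), (a2, b2, c2) = sorted([t1, t2], key=lambda t: t[1])
    if c1 <= a2:
        return 0.0
    return (c1 - a2) / ((c1 - b1) + (b2 - a2))


# Intervals in E+04 kWh/m2; mid-point peaks are an assumption of this sketch.
t1 = (0.8, 1.1, 1.4)
t2 = (1.2, 1.4, 1.6)
h = consistency_index(t1, t2)


def fused(x: float) -> float:
    """Adaptive fusion (Eq. (8)) of the two triangular distributions at x."""
    p1, p2 = tri(x, *t1), tri(x, *t2)
    return max(min(p1, p2) / h, min(max(p1, p2), 1.0 - h))


print(round(h, 3), round(fused(1.25), 3), round(fused(1.28), 3))
```

Under these assumptions the sketch should give h close to 0.4, and fused possibilities close to 0.625 at 1.25 and 1 at 1.28 E+04 kWh/m².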


Fig. 3 Fusion of distributions $\pi_1$ and $\pi_2$, according to Eq. (8). Calculations in Table 2

Table 2 Calculations for fused possibility distribution of energy requirement for CMOS wafer and corresponding GW

| $\omega_i$ in E+04 kWh/m² | $\pi_1$ | $\pi_2$ | $\min(\pi_1, \pi_2)$ | $A = \min(\pi_1, \pi_2)/h$ | $\max(\pi_1, \pi_2)$ | $B = \min(\max(\pi_1, \pi_2), 1-h)$ | $\pi(\omega) = \max(A, B)$ | GW in kg CO₂ eq per m² |
|---|---|---|---|---|---|---|---|---|
| 0.9 | 0.3 | 0 | 0 | 0 | 0.3 | 0.3 | 0.3 | 6.82E+03 |
| 1 | 0.74 | 0 | 0 | 0 | 0.74 | 0.6 | 0.6 | 7.58E+03 |
| 1.1 | 1 | 0 | 0 | 0 | 1 | 0.6 | 0.6 | 8.34E+03 |
| 1.2 | 0.7 | 0 | 0 | 0 | 0.7 | 0.6 | 0.6 | 9.09E+03 |
| 1.25 | 0.5 | 0.25 | 0.25 | 0.625 | 0.5 | 0.5 | 0.625 | 9.47E+03 |
| 1.28 | 0.4 | 0.4 | 0.4 | 1 | 0.4 | 0.4 | 1 | 9.70E+03 |
| 1.3 | 0.35 | 0.5 | 0.35 | 0.875 | 0.5 | 0.5 | 0.875 | 9.85E+03 |
| 1.35 | 0.15 | 0.7 | 0.15 | 0.375 | 0.7 | 0.6 | 0.6 | 1.02E+04 |
| 1.4 | 0 | 1 | 0 | 0 | 1 | 0.6 | 0.6 | 1.06E+04 |
| 1.45 | 0 | 0.75 | 0 | 0 | 0.75 | 0.6 | 0.6 | 1.10E+04 |
| 1.6 | 0 | 0.25 | 0 | 0 | 0.25 | 0.25 | 0.25 | 1.21E+04 |

3.3 Case 3: multiple values available from various sources

Often, the information on inputs and outputs used to connect to LCI may be available as multiple point estimates. In these situations, a possibility estimate $\pi$ can be assigned to these values by the experts. Since possibilities can be postulated as belief functions or likelihood functions (Shafer 1987; Dubois et al. 1995), they can be synthesized with the help of the same set of equations as in the previous example.

Continuing with the wafer manufacturing energy example, the literature offers some more data points, as provided in the three right columns of Table 3. Without loss of generality, we can assume that four experts ($E_i$) express their confidence $\lambda_i$ in these numbers being the true value. These confidence levels are given in the first four columns of Table 3. As this table indicates, these expert opinions are not nested, nor is there any one row that clearly has $\lambda$ values higher than the other rows for the corresponding columns. In other words, these beliefs are conflicting. To fuse this information, we need to use Eq. (8). As the first step, using Eqs. (4) and (5), $\pi_{cm}$ is the minimum of the $\lambda_i$ for a given $\omega$, and $\pi_{dm}$ is the maximum of the $\lambda_i$ for a given $\omega$. Thus $h = 0.8$ and $(1 - h) = 0.2$. Putting these values in Eq. (8), we get Table 4 of the fused possibility distribution.
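The last step of this calculation can be sketched as follows, starting from the conjunctive and disjunctive rows as reported in Table 4. The function name `fuse_from_bounds` is hypothetical.

```python
def fuse_from_bounds(pi_cm: list, pi_dm: list) -> tuple:
    """Return (h, fused), where h is the largest conjunctive value and fused
    applies max(pi_cm / h, min(pi_dm, 1 - h)) element by element."""
    if len(pi_cm) != len(pi_dm):
        raise ValueError("both rows must have the same length")
    h = max(pi_cm)
    if h == 0.0:
        raise ValueError("total conflict: normalization by h is undefined")
    fused = [max(c / h, min(d, 1.0 - h)) for c, d in zip(pi_cm, pi_dm)]
    return h, fused


omega = [1.1, 1.3, 1.5, 1.8, 2.5]    # E+04 kWh/m2
pi_cm = [0.8, 0.8, 0.6, 0.6, 0.3]    # minimum expert confidence per value
pi_dm = [0.9, 1.0, 0.8, 0.7, 0.8]    # maximum expert confidence per value

h, fused = fuse_from_bounds(pi_cm, pi_dm)
for w, p in zip(omega, fused):
    print(w, round(p, 3))
```

With these inputs the disjunctive term is capped at 1 - h = 0.2 for every value, so the fused distribution is driven entirely by the normalized conjunctive term.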

The data quality difference between individual point estimates and fused information is reflected in the impact assessment presented in the last row of Table 4. Due to the linear relationship, the possibility distribution of the energy requirement maps exactly to that of the respective GW impact values. Putting these numbers in the context of US annual per capita values (Kim et al. 2013), the normalized global warming potential (GWP) ranges from 0.37-0.44 for the most possible values to 0.84 for the least possible value. This may be a big difference depending on the objective of the LCA.


3.4 Case 4: information available as linguistic propositions 

The procedure for fusion is similar when information is
available as sets of linguistic propositions (Dubois and
Prade 2004). Consider an LCA on a product enabled by an
ENM, such as a T-shirt or a pair of socks with nanosilver,
or sports equipment with carbon nanotubes. In these
assessments, the amount of ENM released at various stages
of the life cycle, its physicochemical characteristics upon
release, and its chances of being biologically available are
often the questions of interest in order to populate the LCI.
Often, some of these conditions either co-exist or contradict
each other, and opinions may differ as to which combination
of conditions is likely to occur. Even when numerical data
are not available, a situation like this can be formulated
qualitatively in the setting of possibility theory. Without
loss of generality, consider a few simplistic LCI-relevant
propositions on which some expert opinions on fate and
transport at manufacturing of the ENM could be gathered:


Table 3 Expert belief λ on the given ω that it represents the true value

λ_i on the ω_i by E_i      ω_i (E+04   Data from the wafer   Supporting literature
E1    E2    E3    E4       kWh/m²)     of size (mm)
0.8   0.8   0.8   0.9      1.1         300                   (Krishnan et al. 2008), one of the scenarios from 2003 data from semiconductor industry consortium (Murphy et al. 2003)
0.9   1     0.9   0.7      1.3         300                   One of the scenarios from 2003 data from semiconductor industry consortium (Murphy et al. 2003), merging of intervals from Table 1
0.8   0.8   0.6   0.7      1.5         150                   Adjusted for 300 mm (Williams et al. 2002)
0.7   0.7   0.6   0.7      1.8         200                   Adjusted for 300 mm (Williams et al. 2002)
0.3   0.8   0.7   0.8      2.5         200                   Adjusted for 300 mm (Yao et al. 2004)
Table 4 Calculations for possibility distribution for the energy estimate for wafer fabrication based on the fusion of three intervals and four point estimates coming from heterogeneous sources

ω (E+04 kWh/m²)            1.1    1.3    1.5    1.8    2.5
π_cm                       0.8    0.8    0.6    0.6    0.3
π_dm                       0.9    1      0.8    0.7    0.8
h                          0.8
1 - h                      0.2
A = min(1 - h, π_dm)       0.2    0.2    0.2    0.2    0.2
B = π_cm/h                 1      1      0.75   0.75   0.375
π_fused = max(A, B)        1      1      0.75   0.75   0.375
GW (E+04 kg CO2 eq/m²)     0.83   0.98   1.14   1.36   1.9


ω_1 The ENM likely to release to air, quantity likely to be
above the safety level
ω_2 The ENM likely to release to water and soil, quantity
likely to be above the safety level
ω_3 The ENM likely to release to soil, quantity likely to be
below the safety level
ω_4 The ENM likely not to release to air
ω_5 The ENM likely to release to water, quantity below the
safety level
ω_6 The ENM likely not to release to soil

There will be numerous subsets from various permutations
and combinations of these opinions, and it can be assumed
that the experts will be able to put their beliefs on at least some
of them. Without loss of generality, it can be assumed that
three experts (E) provided their beliefs (λ) on some of the
subsets of the ω_i's as below:

E1 {({ω_1, ω_5}, 0.3), ({ω_1, ω_2}, 0.5), ({ω_1, ω_6, ω_5}, 0.6)}
E2 {({ω_4, ω_5}, 0.7), ({ω_4}, 0.9), ({ω_6, ω_4}, 0.8), ({ω_6, ω_4, ω_4}, 0.8)}
E3 {({ω_1, ω_3}, 0.4), ({ω_3, ω_4, ω_5}, 0.4), ({ω_3, ω_5}, 0.7)}

Possibility distributions can be derived from this information
by the minimum specificity principle as represented in
Eqs. (3) and (4), and are provided in Table 5. Applying
Eqs. (4) and (8) to the above matrix, we get Table 6, whose
π_fused row represents the fused possibility distribution,
which in fact can be interpreted as fuzzy membership
degrees of the linguistic propositions represented by ω_1-ω_6.
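
The derivation of each expert's possibility distribution can be sketched as follows. This is a minimal sketch, assuming that Eqs. (3) and (4) take the standard minimum specificity form, in which a statement "the truth lies in subset S with belief λ" caps the possibility of every proposition outside S at 1 - λ, and everything not capped stays at 1. The function name `possibility_from_beliefs` and the use of integer indices for ω_1 to ω_6 are illustrative choices. The fourth subset of E2 is hard to read in the source and is taken here as {ω_6, ω_4}; this does not change the result, because the statement ({ω_4}, 0.9) already caps every other proposition at 0.1.

```python
def possibility_from_beliefs(universe, beliefs):
    """Least specific possibility distribution compatible with the beliefs.

    beliefs: list of (subset, lam) pairs, read as "the truth lies in
    subset with certainty at least lam" (assumed reading of Eqs. 3-4).
    """
    pi = {}
    for w in universe:
        # Each statement caps pi(w) at 1 - lam when w lies outside its subset.
        caps = [1 - lam for subset, lam in beliefs if w not in subset]
        # No cap applies when w belongs to every subset: pi(w) stays at 1.
        pi[w] = round(min(caps, default=1.0), 10)  # round away float noise
    return pi


universe = [1, 2, 3, 4, 5, 6]  # indices of omega_1 .. omega_6
experts = {
    "E1": [({1, 5}, 0.3), ({1, 2}, 0.5), ({1, 6, 5}, 0.6)],
    # Fourth E2 subset assumed to be {omega_6, omega_4} (garbled in source).
    "E2": [({4, 5}, 0.7), ({4}, 0.9), ({6, 4}, 0.8), ({6, 4}, 0.8)],
    "E3": [({1, 3}, 0.4), ({3, 4, 5}, 0.4), ({3, 5}, 0.7)],
}

table5 = {name: possibility_from_beliefs(universe, b)
          for name, b in experts.items()}
for name, pi in table5.items():
    print(name, [pi[w] for w in universe])
```

Under these assumptions the three printed rows agree with the rows of Table 5.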


Table 5 Possibility distributions derived from the expert opinions (beliefs) on various subsets in the ENM example

                          ω_1    ω_2    ω_3    ω_4    ω_5    ω_6
π_1 = E1's π for ω_i      1      0.4    0.4    0.4    0.5    0.5
π_2 = E2's π for ω_i      0.1    0.1    0.1    1      0.1    0.1
π_3 = E3's π for ω_i      0.3    0.3    1      0.3    0.6    0.3
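
The step from the three expert rows to a single fused row can be checked with a short calculation. The sketch below takes the expert rows as printed in Table 5, forms π_cm and π_dm as the column-wise minimum and maximum across experts, and then applies the assumed form of Eq. (8), π_fused = max(π_cm/h, min(π_dm, 1 - h)) with h = max π_cm. Variable names are illustrative, and the sketch assumes h > 0, as is the case here.

```python
# Rows of Table 5: each expert's possibility degrees for omega_1 .. omega_6
pi_e1 = [1.0, 0.4, 0.4, 0.4, 0.5, 0.5]
pi_e2 = [0.1, 0.1, 0.1, 1.0, 0.1, 0.1]
pi_e3 = [0.3, 0.3, 1.0, 0.3, 0.6, 0.3]
rows = [pi_e1, pi_e2, pi_e3]

# Conjunctive mode: what all experts allow; disjunctive mode: what any allows
pi_cm = [min(col) for col in zip(*rows)]
pi_dm = [max(col) for col in zip(*rows)]

h = max(pi_cm)  # degree of agreement among the experts (assumed h > 0)
row_a = [min(1 - h, dm) for dm in pi_dm]
row_b = [cm / h for cm in pi_cm]
pi_fused = [max(a, b) for a, b in zip(row_a, row_b)]

print("pi_cm   ", pi_cm)
print("pi_dm   ", pi_dm)
print("h       ", h)
print("pi_fused", [round(p, 3) for p in pi_fused])
```

Because h is only 0.3 in this example, the disjunctive term carries most of the fused distribution, and only ω_4 reaches full possibility. Row B evaluates to about 0.333 for the remaining propositions.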


4 A recap on information fusion process 

Thus, as the above discussion illustrates, a rather simple
procedure can be used to merge disparate pieces of data
in order to produce a more reliable estimate of an intermediate
input or an elementary LCI. A crisp flowchart for applying this
approach to various situations is provided in Fig. 4.

As the flowchart illustrates, the procedure for information 
fusion will take slightly different routes depending on whether 
the data is linguistic or numerical, but once the respective 
route is taken, the flow of logic is similar. When the data is 
linguistic, the first step is to obtain associated belief values. 
Then, possibilities can be derived as in Table 5, and fused 
possibility distribution can be obtained as in Table 6. 

As the flowchart further illustrates, when the data is
numerical, its merger procedure is determined by whether
it is available as intervals or point estimates. When
the intervals are nested, the outermost bounds can be
taken as minimum and maximum. This setup is similar
to that explored in earlier LCA studies with fuzzy data
(Ardente et al. 2004; Tan 2008; Tan et al. 2007;
Chevalier and Teno 1996). When the intervals are not
nested, they can be merged to get a possibility
distribution by first capturing the consistency (h), and then
getting the values of π_cm and π_dm to input in Eq. (8).
The case of two non-nested intervals can be generalized
to accommodate more than two intervals. Lastly, when
the estimates are available as data points, the case of
two data points follows the path of nested intervals, and
the case of multiple data points follows a path similar
to that followed by linguistic data once the belief
values are obtained. Depending on whether this
information is a primary LCI flow or an intermediate input, it
can be processed further in the LCA framework.
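
The first decision on the interval branch, nested or not, can be sketched as a small check. This is a minimal sketch rather than the article's procedure in code: the function names `is_nested` and `outer_bounds` and the example intervals are made up for illustration. Only the nested route is carried through, since the non-nested route needs the consistency h and Eq. (8).

```python
def is_nested(intervals):
    """True if the intervals can be ordered so that each contains the next."""
    # Widest-first order: lower bound ascending, then upper bound descending.
    ordered = sorted(intervals, key=lambda iv: (iv[0], -iv[1]))
    return all(outer[0] <= inner[0] and inner[1] <= outer[1]
               for outer, inner in zip(ordered, ordered[1:]))


def outer_bounds(intervals):
    """Outermost bounds, used as minimum and maximum on the nested route."""
    return min(lo for lo, _ in intervals), max(hi for _, hi in intervals)


# Hypothetical intervals (lower, upper); the numbers are not from the article.
nested = [(1.2, 1.8), (1.0, 2.0), (1.4, 1.5)]
overlapping = [(1.0, 1.6), (1.4, 2.0)]

for name, ivs in [("nested", nested), ("overlapping", overlapping)]:
    if is_nested(ivs):
        print(name, "-> nested route, bounds:", outer_bounds(ivs))
    else:
        print(name, "-> non-nested route: compute h, pi_cm, pi_dm for Eq. (8)")
```

Sorting before the pairwise check means the intervals can be supplied in any order.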

5 Conclusions and discussion 

This paper thus demonstrates how conflicting information
from multiple sources can be synthesized in LCA using an
information fusion technique. The technique is demonstrated
for the case of an electronic product LCA, and the overall
procedure and decision points are elaborated using a flowchart.

Table 6 Calculation for possibility distribution for fused linguistic propositions represented by ω_1-ω_6

                          ω_1    ω_2    ω_3    ω_4    ω_5    ω_6
π_cm                      0.1    0.1    0.1    0.3    0.1    0.1
π_dm                      1      0.4    1      1      0.6    0.5
h                         0.3
1 - h                     0.7
A = min(1 - h, π_dm)      0.7    0.4    0.7    0.7    0.6    0.5
B = π_cm/h                0.3    0.3    0.3    1.0    0.3    0.3
π_fused = max(A, B)       0.7    0.4    0.7    1      0.6    0.5

The question now is when these deliberate efforts (the
extra mile) toward data synthesis are justified. The answer
lies in the level of representativeness required to meet the goal
of the study (Weidema 2000; Frischknecht et al. 2004; Lloyd
and Ries 2008; Weidema and Wesnaes 1996; Ciroth et al.
2004; ISO 2006). Two recurring topics from these
discussions, namely, the need to account for, and hopefully to
enhance, the quality of data used for LCAs, and the need to
adhere to the ISO-prescribed iteration of the LCA exercise
upon identification of significant issues, are addressed, at least
partially, by using the framework described in this article. The
framework presented in this paper would be useful especially
when the significant issues identified are associated with
conflicting data from multiple sources. The procedure
described in this paper can be followed within the current
framework of LCA, without requiring a specialized software
tool or skills beyond the understanding of basic logic.

For example, an LCA can be conducted by a practitioner in
any preferred way. Upon finding a set of the most significant
LCI items or intermediate input categories via sensitivity
analysis, this type of fusion exercise can be performed to refine
the values of the top contributing inputs, which can then be
reassessed for their impacts. Thus, with only a small additional
step, the proposed method can be used to enhance the
credibility of the key findings of the assessment. Alternatively,
this approach can also be applied upfront only to those top
intermediate input categories that are of particular interest to the
stakeholders and therefore demand more representativeness.
Estimation of LCIA ranges with their respective possibility
information, akin to the wafer energy-GWP example above,
may be highly desirable in the decision-making context.

Fig. 4 A flowchart illustrating which fusion route to use,
and how and when, in merging inadequate information

In the interest of collective retention of individual
efforts and progression toward improved data quality, it might
be useful for the LCI databases to absorb fused information in
a transparent manner. On that front, we hope that LCI data
entry and management software tools, such as ecoEditor,
make provisions to incorporate this type of fused data in their
databases. Currently, there are limited approaches available to
represent merged data in a transparent manner. Such
provisions would not only improve the representativeness of the
data derived from disparate pieces of information, but also
allow others to utilize the underlying information without having
to redo the exercise. We believe that using fused data, either
self-generated or taken from established databases, will
undoubtedly enhance the representativeness of data used in LCA,
thereby aiding LCA's role as a credible decision-informing
tool.